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Abstract: [ Aim] This study aims to explore the genetic diversity, genetic differentiation, and phylogenetic 
relationships among populations of the pierid butterfly Colias fieldii in China, to infer their origin and 
divergence time, and to preliminarily clarify the causes of their spatiotemporally evolutionary history. 
[ Methods] The four mitochondrial gene (COI, Cytb, NDI and ND5) sequences of 115 individuals from 23 
geographic populations of C. fieldii in China collected in 2006 -2018 were amplified by PCR and sequenced. 
The genetic diversity and genetic differentiation were analyzed using MEGA v.7.0, DnaSP v.5.0, Arlequin v. 
3.5.1 and other genetic analysis software. Using the closest relatives as the outgroups, the phylogenetic trees 
and haplotype median-joining network of C. fieldii were reconstructed with such analytical software as IQ- 
TREE, MrBayes v.3.1.2, Network v.4.6 and BEAST v. 1.8.3, and the origin and divergence time of C. 
fieldii were estimated by using relaxed molecular dating method and calibrations of the previous studies. Based 
on the present biogeographic distribution of C. fieldii and the main earth environmental events since the 
Quaternary Period, the spatio-temporal pattern of its biogeographic distribution and the underlying earth 
environmental factors were tentatively inferred. [Results] The aligned sizes of mitochondrial gene segments of 
COI, Cytb, NDI and NDS of C. fieldii populations are 648, 699, 393 and 777 bp, respectively, and the 
concatenated sequence of the four genes is 2 517 bp in length, which was shown to be significantly AT biased. 
In total, 18 haplotypes based on the four mitochondrial gene sequences were found in 115 individuals of 23 
geographic populations of C. fieldii, with the haplotype diversity (Hd) of 0. 677 + 0. 048 and nucleotide 
diversity ( m) of 0. 00066 + 0. 00007 of the total population, showing a relatively high level of haplotype 
diversity and a low level of nucleotide diversity. The phylogenetic analysis showed that 18 haplotypes of C. 
fieldii populations were distinctly categorized into two large clades (clade I and II). Clade I included 13 
haplotypes of populations from Shaanxi, Henan, Gansu, Anhui, Hubei, Sichuan, Qinghai, and some regions 
of Yunnan. Clade II included five haplotypes of populations from some regions of Yunnan and Tibet. The results 
of reconstructed haplotype median-joining network were generally consistent with those of the phylogenetic tree. 
AMOVA analysis indicated that a larger level of population differentiation (64.36% ) occurred between the two 
haplotype clades and a subtle genetic differentiation (35.64% ) existed within each haplotype clade of C. 
fieldii. The analysis results of neutrality tests and mismatch distribution indicated that populations with 
haplotypes in clade I did not experience a population expansion event, whereas those with haplotypes in clade II 
probably had a sudden demographic expansion at about 0.085 Ma in the late Pleistocene, a little earlier than 
the Last Glacial Maximum event, which may be caused by the warm and humid plateau climate in the 
interglacial period, the expansion of forest grassland and the decreased heavy rainfall on the core plateau. 
[ Conclusion] The genetic differentiation of C. fieldii populations is correlated significantly with the 
geographical distance. In addition, we proposed that C. fieldii populations originated at about 0.48 Ma in 
southwestern areas of China (presently the Hengduan Mountains and adjacent areas) , and began to diversify 
into two clades and later dispersed into other low-latitude areas due to the Quaternary glacial-interglacial cycle 
events, Southeast Asia monsoon and different habitat environments. 

Key words: Colias fieldii; genetic differentiation; haplotype; mitochondrial gene; molecular dating; 
phylogeography; China 
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1 INTRODUCTION 


Colias fieldii is one pierid butterfly species 
commonly found and distributed in Central Asia, 
inhabiting at areas of grassy and wooded steppes 
(Laiho and Stahls, 2013; Ding and Zhang, 2017). 
As one of the important pollinator insects, it plays a 
critical role in plant reproduction ( Yang et al., 
2019 ). 


studies have provided much useful evidence for 


Previous morphological and ecological 
taxonomy and phylogeny of this species and related 
taxa (Meng et al., 2013). In recent decades, more 
researchers attempted to investigate the relevant 
studies by using molecular data criteria. For 
example, Wheat and Watt (2008) reconstructed the 
phylogenetic of North 
American Colias taxa using mitochondrial genes, and 


molecular relationships 
their results suggested that molecular divergences 
among the Colias species were relatively small. 
Schoville et al. (2011) studied the evolutionary 
history of Colias behrii, showing that C. behrii had a 
very low level of genetic diversity at mitochondrial 
and nuclear loci. However, this study did not 
explore the impacts of Quaternary glacial period on 
distribution and genetic structure of the Colias 
species in the Qinghai-Tibet Plateau (QTP). 

The Qinghai-Tibet Plateau (QTP) in China, 
recognized as the world’ s third pole, is the highest 
and largest plateau in the world with an average 
elevation of about 4 000 m and an area of about 2.5 
million km’. Because of its high elevation and 
specific climate, this plateau dramatically affects 
terrestrial fauna and ecosystems in western China and 
the neighboring areas. Specifically, the rapid uplift 
of the QTP between 1. 1 and 0.6 Ma gave rise to the 
average altitude and initiated widespread mountain 
glaciers during the ice ages (Zheng, 1996; Zhou et 
al., 2006; Favre et al., 2015). The topographic 
variation of the QTP, including the forming of the 
Hengduan Mountains, coupled with cyclical climatic 
changes, namely the alternating glacial-interglacial 
periods in the Pleistocene exerted enormous 
influences on the current spatial distribution and 
genetic structure of many resident species (Cao et 
al., 2012; Li et al., 2012). Previous studies have 
demonstrated that the geological events as well as 
accompanying environmental changes might have 
caused rapidly evolutionary radiation of many insect 
groups (Gratton et al., 2008). C. fieldii is also 
found in the QTP, and its genetic differentiation is 
closely related to the environmental background of 
the QTP. 


Due to lack of recombination, maternal 


inheritance and other merits compared with nuclear 
genes, mitochondrial DNA (mtDNA) genes have 
been widely used to study molecular evolution, 
phylogenetics , and population genetics in animals for 
several decades ( Galtier et al., 2009). The 
cytochrome oxidase subunit I (COI) gene is 
relatively conservative and widely used in inferring 
inter- and intra-species phylogenetic relationships, 
and the cytochrome b ( Cytb) gene has a moderate 
rate of molecular evolution, being suitable to clarify 
the phylogenetic relationships between families, 
genera, species and even populations of a single 
addition, the genes 

subunit 5 gene ( ND5 ) and 
dehydrogenase subunit I gene (NDI) which harbor a 


species. In mitochondrial 


dehydrogenase 


relatively fast evolutionary rates, are suitable for 
resolving relatively lower level relationships 
(Cameron, 2014; He et al., 2016; Su et al., 
2017). In recent decades, combined mitochondrial 
dataset was often utilized to investigate the genetic 
differentiations and phylogenetic relationships among 
insects at the inter- and intra-species level (Tao et 
al., 2020). 

In this study, 
sequences of four mitochondrial genes ( COI, Cytb, 
NDI and NDS ) of total 115 individuals of 23 
populations of C. fieldii from the Qinghai-Tibet 
Plateau and other areas in China. Based on these 


we firstly determined the 


datasets of four mitochondrial gene sequences, we 


analyzed their genetic differentiation among 
populations, assessed their phylogeographic structure 
using a variety of genetic diversity analysis methods , 
and analyzed their demographic history using the 
mismatch distribution and neutrality test analysis 
methods, reconstructed their phylogenetic trees, and 
estimated the divergence time of their main lineages 
with relaxed molecular dating methods. In general, 
we aim to clarify their divergence timescales and the 
related geological and environmental events in the 


evolutionary history of C. fieldii. 


2 MATERIALS AND METHODS 


2.1 Sample collection and DNA extraction 

We collected 115 adult individuals of C. fieldii 
from 23 sampling sites in central and southwestern 
China. Sample size and geographical coordinates for 
After 


species identification, the fresh thorax muscle tissues 


each population were shown in Table 1. 


were immediately preserved in 100% ethanol for 
DNA fixation and stored at - 20°C for further 
genomic DNA isolation. Total genomic DNA was 
extracted from thorax muscle using a DNA extraction 
kit (Sangon Biotech, Shanghai) according to the 
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manufacturer’ s instructions. 
2.2 PCR amplification and sequencing 

Four mitochondrial gene ( COI, Cytb, NDI, 
and NDS ) fragments were amplified by using 
standard short primers (Simon et al., 1994; Yagi et 
al., 1999). All primers were synthesized by Sangon 
Biotech (Shanghai) shown in Table 2. PCR was 
performed in a PCR reaction (50 uL) of genomic 
DNA (20 ng/mL) 1.5 uL, 10 x Buffer 6.0 uL, 
Mg (25 mmol/L) 8.0 uL, dNTPs (0.2 mmol/L) 


1.5 uL, Taq DNA polymerase 0.6 uL, 1.8 uL of 
each of the primers (1.0 ng/mL) , and ddH,O 28.8 
uL, under the following conditions; an initial 
denaturation at 95°C for 5 min; 35 cycles of 
denaturation at 94°C for 50 s, annealing at 46 — 
52°C (depending on the primer pairs) for 50 s, and 
extension at 72°C for 120 s; and a final extension at 
72°C for 10 min. The PCR products were sequenced 
on an ABI-377 automatic DNA sequences ( Sangon 


Biotech, Shanghai). 


Table 1 Sampling information, haplotype diversity and nucleotide diversity estimated based on mitochondrial genes 
COI, Cytb, NDI, and NDS of Colias fieldii populations in China in this study 

















size latitude of individuals) Ha z 
BJ Baoji, Shaanxi 5 33.35°N, 106. 18°E Al(1), A15(4) 0. 400 0.00016 
BMS Bamishan, Gansu 5 35.53°N, 103.25°E A15(5) 0 0 
HBY Huangbaiyuan, Shaanxi 5 33.73°N, 107. 40°E A5(4), A15(1) 0. 400 0.00016 
LY Luoyang, Henan 5 33.35°N, 111.8°E A15(5) 0 0 
ELS Elashan, Qinghai 5 34.07°N, 103.41°E A4(4), AI5(1) 0.520 0. 00052 
GN Guanggaishan, Gansu 5 34.13°N, 103. 17°E A15(5) 0 0 
KJM Kajiaman, Gansu 5 35.05°N, 102.52°E A6(4), A15(1) 0. 400 0.00016 
HF Hefei, Anhui 5 31.30°N, 117.14°E A15(5) 0 0 
sY Shiyan, Hubei 5 32.65°N, 110.79°E A15(5) 0 0 
JZG Jiuzhaigou, Sichuan 5 34. 19°N, 104. 27°E A15(5) 0 0 
SNJ Shennongjia, Hubei 5 31.21°N, 110.03°E AI5(5) 0 0 
QCS Qingchengshan, Sichuan 5 30.54°N, 103.35°E A10(1), A11(1), A12(1), A15(2) 0. 900 0. 00079 
LJ Lijiang, Yunnan 5 26. 86°N, 100.25°E A3(1), A7(2), A15(1) 0. 800 0.00087 
DQ Deqin, Yunnan 5 29. 15°N, 99.32°E A3(2), A15(3) 0. 600 0.00119 
DL Dali, Yunnan 5 26.42°N, 101.03°E A2(3), A7(2) 0. 600 0. 00048 
MK Mangkang, Tibet 5 30.20°N, 99.05°E A15(5) 0 0 
XLZ Linzhi, Tibet 5 30.20°N, 91.24°E A18(5) 0 0 
LS Lhasa, Tibet 5 29.97°N, 91.11°E A14(5) 0 0 
SFG Shifogou, Gansu 5 36. 12°N, 103.45°E A13(2), A15(3) 0. 600 0. 00095 
LZ Lanzhou, Gansu 5 36.03°N, 103.40°E A8(1), A9(2), A15(2) 0. 800 0. 00040 
XY Yangling, Shaanxi 5 34.20°N, 107. 59°E A15(5) 0 0 
ZO Zhaqi, Tibet 5 29.28°N, 91.26°E A16(1), A17(2), A18(2) 0. 800 0. 00064 

Table 2 PCR primers used in this study 

Gene GenBank accession no. Primer sequences (5'—3’) Annealing temperature (°C ) 
oo wos 
Cytb MN305324 - MN305438 ii E E a 52.2 
人 = 
NDS MN305554 - MN305668 3 E B E P AT 52.8 


2.3 Phylogenetic analysis 
Using Gonepteryx rhamni and Eurema hecabe as 
the outgroups, phylogenetic trees were reconstructed 


with the Bayesian inference ( BI) and maximum 
likelihood ( ML ) methods based on different 


mitochondrial datasets. The jModelTest v. 2. 0 
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( Darriba et al., 2012) was used to select the GTR + 
I+ G model as the best-fit substitution model of 
molecular evolution. BI analysis using the Markov 
chain Monte Carlo (MCMC) method was performed 
by MrBayes v. 3. 1. 2 ( Huelsenbeck and Ronquist , 
2001; Ronquist and Huelsenbeck, 2003). ML 
analysis was carried out in IQ-TREE v. 1. 6. 8 under 
the GTR +I +G model ( Nguyen et al., 2014; Zhang 
et al., 2018). Finally, a ML tree in NEWICK 
format was visualized by FigTree v. 1. 4.3 ( Rambaut, 
2009 ). 
haplotypes was constructed with Network v.4.6 
( Bandelt et al., 1999 ). 


structure was tested by calculating two indices of 


In addition, median-joining network of 
The phylogeographic 


genetic differentiation G,, and Nor using DnaSP 
v. 5.0 (Pons and Petit, 1996; Librado and Rozas, 
2009). The genetic differentiation coefficient ( F,r) 
and gene flow (Nm) were calculated using Arlequin 
v.3.5.1 (Excoffier and Lischer, 2010). Analysis 
of molecular variance (AMOVA) was performed to 
distinguish variations within and among populations , 
with significance tested via 1 000 nonparametric 
permutations. 
2.4 Population demographic analysis 

To infer population demographic history of C. 
fieldii, Tajima’s D (Tajima, 1989) and Fu’s F, 
(Fu, 1997 ) were 
v. 3.5.1. 


used to identify historical processes between recent 


calculated using Arlequin 
In addition, mismatch distribution was 


demographic expansion and population equilibrium 
(Rogers and Harpending, 1992; Cristiano et al., 
2016). The validity of the two models fitting to our 
data was assessed by the sum of squared deviation 
(SSD) and Harpending’ s raggedness index ( HRI) 
1994 ). 


time in generations ) was then 


( Harpending, The time of population 
expansion (t, 
estimated using the formula 7 = 2ut ( You et al., 


2010 ) ， 


distribution and u was 


where r was the mode of mismatch 
the mutation rate per 
generation for the entire sequences under this study 
(the value of u was calculated using the equation 
u =k, where u was the mutation rate per nucleotide 
per generation and k was the number of nucleotides 
assayed ) . 
2.5 Divergence time estimation analysis 

Based on the 18 haplotypes, using Artogeia 
melete, Pieris rapae, Aporia intercostata and Aporia 
hippia as the outgroups, C. fieldii of this study and 
their related taxa of C. montium, C. erate, C. 
croceus, Catopsilia pomona, Gonepteryx mahaguru , 
Gonepteryx rhamni, and Eurema hecabe were used to 
and estimate the 


reconstruct the species tree 


divergence time simultaneously using the BEAST 


v.1.8.3 ( Drummond and Rambaut, 2007 ). 
Uncorrelated relaxed clocks were used to estimate 
the branch lengths, with the tree prior of the birth- 
death process, and the nucleotide substitution model 
was set to the GTR + G model. The oldest Pieridae 
libytheoidea 
proserpina were selected for fossil calibrations, and 


fossils  Stolopsyche and Coliates 
thus the minimum age of the genus Pieris and the 
genus Aporia were constrained to be 37.2 -33.9 Ma 
and 33. 5 - 30 Ma, respectively (Scudder, 1875, 
1889; Braby and Trueman, 2006). In addition, 
according to insect-host plant coevolutionary scenario 
and the molecular dating of the Brassicales’ s 
radiation, the earliest divergence time of Coliadinae 
and Pierinae was set to be 90 - 85 Ma. The 
convergence of the MCMC chains was evaluated by 
ESS values above 200 in Tracer v. 1.7.1 ( Rambaut 
et al., 2018). The maximum-clade-credibility tree 
was finally visualized and edited in FigTree v. 1.4.3 
(Carneiro et al., 2018). 
2.6 Data analysis 

The mitochondrial gene sequences of COI, 
Cytb, NDI, and NDS from all individuals of C. 
fieldii were aligned via MAFFT v. 7. 1 (Katoh et 
al., 2017) and concatenated by DAMBE v. 6. 0 
(Xia, 2017). The nucleotide composition of the 
mitochondrial gene sequences was analyzed by 
MEGA v.7.0 (Kumar et al., 2016). The haplotype 
diversity (Hd), haplotype number (h), and 
nucleotide diversity (77) were calculated using 


DnaSP v.5.0 (Librado and Rozas, 2009 ). 


3 RESULTS 


3.1 Genetic diversity 

Four mitochondrial genes including partial COI 
(648 bp), Cytb (699 bp), NDI (393 bp) and 
ND5 (777 bp) were amplified and obtained, and 
the concatenated sequence of the four genes is 2 517 
bp in length. The obviously AT-biased dataset with 
no indels or stop codons totally contained 18 
haplotypes (Al - A18 ) (Table 3), among which 
six were unique haplotypes, each found only in one 
population, and the remaining twelve haplotypes 
were shared among two or more different 
populations. The most abundant haplotype A15, 
shared by 64 individuals, was distributed in 18 
populations (Table 1). The calculated haplotype 
diversity (Hd) and nucleotide diversity (7) of the 
whole population were 0.677 +0. 048 and 0. 00066 + 
0.00007, respectively, with the haplotype diversity 
and nucleotide 


diversity of each geographic 


population shown in Table 1. 
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Table 3 Characteristics of haplotypes based on the 
concatenated sequence of mitochondrial genes COI, Cytb, 
NDI and NDS of Colias fieldii populations in China 





Haplotype TCU) C A G Toal AT-skew GC-skew 
Al 43.6 11.4 33.1 11.9 2517.0 -0.1368 0.0221 
A2 43.6 11.4 33.1 11.9 2517.0 -0.1368 0.0221 
A3 43.5 11.4 33.2 11.9 2517.0 -0.1352 0.0205 
A4 43.6 11.4 33.1 12.0 2517.0 -0.1374 0.0238 
A5 43.7. 11.3 33.1 11.9 2517.0 -0.1377 0.0256 
A6 43.6 11.4 33.1 11.9 2517.0 -0.1368 0.0221 
A7 43.6 11.4 33.1 11.9 2517.0 -0.1368 0.0221 
AS 43.6 11.4 33.1 11.9 2517.0 -0.1366 0.0222 
A9 43.6 11.4 33.1 11.9 2517.0 -0.1366 0.0222 
Al0 43.6 11.4 33.1 11.9 2517.0 -0.1368 0.0221 
All 43.6 11.4 33.1 11.9 2517.0 -0.1362 0.0239 
Al2 43.6 11.4 33.1 11.9 2517.0 -0.1366 0.0222 
Al3 43.6 11.4 33.1 11.9 2517.0 -0.1362 0.0205 
Al4 43.6 11.4 33.2 11.8 2517.0 -0.1356 0.0188 
Al5 43.6 11.4 33.1 11.9 2517.0 -0.1372 0.0239 
Al6 43.6 11.4 33.1 11.9 2517.0 -0.1362 0.0205 
AI7 43.5 11.4 33.2 11.8 2517.0 -0.1352 0.0171 
AIS 43.6 11.4 33.2 11.8 2517.0 -0.1361 0.0205 


3.2 Phylogenetic analysis 

Our phylogenetic analyses based on the Dataset 
1 showed that BI and ML trees harbored the same 
topologies. Both trees exhibited that C. fieldii of this 


A 


mJ mSFC 于 LS 
m BMS m DQ m ELS 
m HF mJZG m KJM 
LZ = MK m a 
mSY m XY m DL 
m SNJ m XLZ m BJ 
m70 = HBYm LY 
CN m WL 


Pp] = PP20.9 or BS=70 
= PP<0.9 or BS<70 
3.0 


Clade I 


XK PP=1.0 or BS =100 





| A16 © 
vo O 


Gonepteryx rhamni 


Eurema hecabe 


Fig. 1 


Clade II 


study was made up of two major clades (I and IT) 
with relatively strong supports (PP = 1.0, BS = 
100) (Fig. 1: A), which were compatible with the 
haplotype median-joining network analysis (Fig. 1: 
B). Clade I contained 13 haplotypes from Shaanxi, 
Anhui, Hubei, 
and some regions of Yunnan. Clade II consisted of 


Henan, Gansu, Sichuan, Qinghai, 


five haplotypes from some regions of Yunnan and 
Tibet. A7 and A2 of the 
population DL were detected in clades I and I, 
respectively. Clade I could be further divided into 
two subclades, each of which was not geographically 


Among 18 haplotypes, 


specific, while clade II only contained one subclade 
(Fig. 1; A). 

The comparison between the fixed indices Ner 
and Gsr revealed an obvious correlation between 
phylogeny and geography. The Nu value (0. 63323) 
was much larger than the Go, value (0. 18432) 
(P <0.01), indicating the existence of a significant 
phylogeographic The 
haplotype median-joining network showed that the 


structure. reconstructed 
haplotype A7 was probably the ancestral haplotype of 
clade I, which subsequently diverged into two 
subclades ( A3, A4, A7, A10 and A13; A5, A6, 
A8, All, A9, A12, A15 and Al, respectively ) , 
with A3, A7 and A15 being the shared haplotypes 
by populations. The haplotype A16 
should be considered as the ancestral haplotype of 


some 


ag Al A12 


A6 AS 








Clade I 


Clade II 


A18 
A17 


Phylogenetic tree ( A) of all the haplotypes of the concatenated sequence of mitochondrial genes COI, Cytb, NDI 


and NDS of Colias fieldii populations in China inferred using both maximum likelihood (ML) and Bayesian inference ( BI) 
methods and haplotype median-joining network (B) 
Bootstrap values (BS) lower than 50 and posterior probabilities (PP) lower than 0.5 were not shown. Haplotypes are represented as pie-diagrams with 


the slice-size proportional to the number of individuals as detailed in Table 1. 


Al - A18; Haplotypes 1 - 18. 
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clade II, with A14 and A18 shared each by two 
populations. Of all the populations in this study, 
populations from Yunnan had the highest number of 
haplotypes, with A15 being the most widely 
distributed one. 
3.3 Population differentiation 

The results of analysis of AMOVA showed that 
the genetic variation among and within populations 
were 64.36% and 35.64% , respectively, indicating 
a larger population differentiation between the two 
clades of C. fieldii and a subtle differentiation within 
each clade (Table 4). Meanwhile, the genetic 
differentiation coefficient ( Fs) of the total 
population was 0.608, and the gene flow (Nm) of 
the total population was 0. 160, indicating high 
population genetic differentiation (Fs, >0. 250) and 
relatively limited gene flow (Nm <1.000). The Fo 
and Nm values between the two clades were 0. 632 
and 0. 140, respectively. In addition, the calculated 
haplotype diversity (Hd) and nucleotide diversity 
(ar) of clade I were 0.511 +0. 064 and 0.00041 + 
0.00008, respectively, and those of clade II were 
0. 723 +0. 062 and 0.00046 +0. 00008, respectively. 


Table 4 Analysis of molecular variance (AMOVA) of 
different Colias fieldii populations in China based on 
the concatenated sequence of mitochondrial genes 
COI, Cytb, NDI and ND5 





Source of Lf Sum of Variance Percentage of 
variation aa squares components variation 
Among populations 1 35.311 0.94531 * 64. 36 
Within population 113 59.141 0. 52337 * 35.64 
Total 114 94.452 1. 46869 * 
* P <0. 001. 
A 
0.5 
mm Freq.Exp. 
04 oe Freq.Obs. 
a 
= 0.3 
T 
3 
5 
= 0.2 
0.1 





0 
0 2 4 6 8 10 12 14 16 18 20 22 


Pairwise mismatch 


3.4 Demographic history 

Demographic analysis based on the 
concatenated sequence of mitochondrial genes COT, 
Cytb, NDI and NDS showed that the two clades I and 
II of C. fieldii populations have undergone different 
demographic histories. The un-unimodal mismatch 
distribution and non-significantly negative Tajima’ s 
D value indicated that no obvious population 
expansion events happened in populations with 
haplotypes in clade I ( Table 5). Demographic 
expansion of populations with haplotypes in clade II 
was confirmed by the significantly negative Fu’ s Fs 
value (Fs = - 28.963, P =0), although Tajima’ s 
D value (D = 0. 160, P > 0. 10) was non- 
significantly positive. The mismatch distribution was 
unimodal (Fig. 2) and both SSD and HRI index 
tests failed to reject the hypothesis of a sudden 
expansion model ( Ramos-Onsins and Rozas, 2002; 


Excoffier, 2004; Li et al., 2016). 


Table 5 Results of neutrality test and mismatch analyses 
of Colias fieldii populations in China with haplotypes in 
clades I and II based on the concatenated sequence 
of mitochondrial genes COI, Cytb, NDI and NDS 





Parameter Clade I Clade II 
Tajima’ s D -1.390 0. 160 
Fu’s Fs - 30.089 * — 28.963 * 
SSD 0.00316 0. 00288 
HRI 0.09102 0. 07802 


Tau (95% CI) 0.090 (0.000 -2.314) 1.242 (0.623 -2. 172) 


Clade I contained 13 haplotypes ( A3, A4, A7, A10, A13, A5, A6, 
A8, All, A9, A12, A15 and Al) from Shaanxi, Henan, Gansu, 
Anhui, Hubei, Sichuan, Qinghai, and some regions of Yunnan. Clade II 
contained five haplotypes ( A2, A14, A16, A17 and A18) from some 
regions of Yunnan and Tibet. The same below. SSD: Sum of squared 
deviation; HRI; Harpending’ s raggedness index; * Significance at P <0. 05. 


B 
0.5 
=m Freq.Exp. 


04 ee Tred.Obs. 


Frequency 
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0 
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Fig. 2 Pairwise mismatch distributions of Colias fieldii populations in China with haplotypes in clades I (A) and II (B) 
based on the concatenated sequence of mitochondrial genes COI, Cytb, NDI and NDS 


Freq. Obs. and Freq. Exp. represent the observed and expected mismatch distributions of a stationary population, respectively. 


3.5 Divergence time estimation 

Our molecular dating results indicated that the 
first divergence between the subfamilies Coliadinae 
and Pierinae originated at about 86. 22 Ma [95% 


confidence interval (CI) = 83. 33 - 89.27 Ma] in 
the Late Cretaceous ( Santonian ), and the two 
subfamilies began to diverge at about 64. 75 Ma 
(95% CI =52. 15 - 77. 73 Ma) and 66. 84 Ma 
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(95% CI = 55. 29 - 77. 87 Ma) during the Late 
Cretaceous and the early Paleocene, respectively. 
For C. fieldii, the divergence time of the two 
evolutionary clades (clades I and IT) occurred at 


about 0.48 Ma (95% CI =0.24 -0.74 Ma) during 
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the middle Pleistocene (Fig. 3). Based on the value 
of 7 and substitution rate of 0.73 x 10 ”per site per 
year for four mitochondrial genes, the demographic 
expansion time of clade estimated at 
approximately 0.085 Ma during the Late Pleistocene. 
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Fig. 3 Evolutionary timescale of the Colias fieldii populations in China estimated by relaxed molecular dating method 

based on the concatenated sequence of mitochondrial genes COI, Cytb, NDI and NDS 
Origin species of COI, Cytb, NDI and NDS and their GenBank accession numbers: Colias montium, MN305669, MN305670, KM669350, MN305671 ; 
Colias erate, KP715146, KP715146, KP715146, KP715146; Colias croceus, KM592967 , KM592967 , KM592967, KM592967; Catopsilia pomona , 
DQ842501, JX274649, GU462136, JX274649; Gonepteryx mahaguru, GU372557, KF881048, KF881048, KF881048; Gonepteryx rhamni, 
MF536894, MF536894, MF536894, MF536894; Eurema hecabe, KC257480, KC257480, KC257480, KC257480; Artogeia melete, EU597124, 
EUS97124, EU597124, EU597124; Pieris rapae, GQ398376, GQ398376, GQ398376, GQ398376; Aporia intercostata, KC461928, KC461928, 
KC461928 , KC461928; Aporia hippia, KX495166, KX495166, KX495166, KX495166. The confidence intervals are shown by blue bars on the 
phylogram and fossil calibration points shown by red asterisks. Q; Quaternary, PL; Pliocene. 


4 DISCUSSION 


4.1 Genetic diversity and population differentiation 

Nucleotide 
provide information about an organism’ s population 
structure and history. Our results revealed that C. 
fieldii had a relatively high level of haplotype 
diversity (Hd >0. 500) and a low level of nucleotide 
diversity (7 <0.005) (Grant and Bowen, 1998). 
In addition, the values of haplotype and nucleotide 
diversities of clade I were lower than those of clade 
II, and the fact might be attributed to the genetic 
drift or 
populations or individuals. Furthermore, the genetic 
differentiation between the two clades I and II was 
mainly associated with a wide range of ecological 


diversities can 


and haplotype 


inbreeding depression among clade I 


types, as well as the complex mountain barriers 
(niche variations) (Slatkin and Maddison, 1989; 
Wright, 1990). 

The haplotype analysis showed that among all 


the geographic populations of C. fieldii, the Yunnan 
population possessed the most haplotypes, including 
the unique and the most broadly distributed ones, 
such as haplotypes A7 and A15. The fact suggests 
that the southwestern Yunnan area probably be the 
diversification centre of C. fieldii in China, which is 
in agreement with the common view that the 
Hengduan Mountains is one of the most significant 
biodiversity hotspots in the world, with high levels of 
species richness and endemism (Yang et al., 2009; 
Che et al., 2010; Song et al., 2010; Liao et al., 
2016 ). differentiation 


analyses are needed to better understand the way in 


However, more genetic 
which palaeogeographical and palaeoclimatic events 
shaped the 


patterns of C. fieldii populations. The small sample 


distribution and phylogeographical 
size of this study is also one of the reasons for this 
result, so we need to examine more populations in 
the future. 
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4.2 Divergence time estimation 

Our phylochronological results (Fig. 3) generally 
agreed with those of Braby et al. (2005), Braby 
and Trueman (2006) and Cao et al. (2016), who 
determined the divergence between Coliadinae and 
Pierinae dated at about 86.34 Ma (95% CI =83.59 — 
88.83 Ma) in the Late Cretaceous, but significantly 
differed from the results obtained by Dong et al. 
(2019), who provided dates of the same event at 
about 71.5 Ma (95% CI = 93.6 -49.4 Ma) (Late 
Cretaceous). We found that the major source of 
differences in dating the tree came from calibration 
points and their distribution in the tree. In this 
analysis, we adopted, in our calibration system 
(Fig. 3), high-quality fossil dates of Pieris and 
Aporia and relevant dates from the host-plants 
( Brassicales ) based on the insect-host plant 
coevolutionary scenario, which were likely better 
proxies approaching the true reference time frame. 
Therefore, the divergence time in this study might be 
earlier than the divergence time estimated without 
the restriction of host plant fossils. Although the time 
obtained in this study was slightly different from the 
time proposed by other scholars, there was still a 
large range overlap for the time estimates of 95% 
confidence interval. 

Our results revealed that the two clades of C. 
fieldii diverged at about 0.48 Ma during the longest 
interglacial time after the Quaternary Wangkun 
glacial period (0. 50 - 0. 72 Ma, according to 
marine isotope stages, MIS 16 -20) (Cui et al., 
1998) and the Southeast Asia moonson happened 
about 0. 57 - 0. 48 Ma. This time of increasing 
warmness and wetness accompanied by mountain 
forest expansion might have driven some populations 
to disperse from the margin into the core area of 
QTP, which colonized at new suitable habitats. In 
addition, some populations later dispersed into other 
low-latitude areas due to the Quaternary glacial- 
cycle and Southeast Asia 
monsoon. All these factors eventually caused the 


interglacial events 
complete diversification of the two clades distributed 
at high- and low-altitudes. That is, the low-altitude 
populations should experience less isolation, while 
high-altitude populations were more isolated in 
general (Lu et al., 2012). Additionally, the genetic 
differentiation of C. fieldii in this study caused by 
geographical isolation could be further reinforced and 
accumulated due to ecological divergence over time, 
and finally the two clades inhabited differently and 
evolved independently into two genetically diversified 
clades (Yan et al., 2010; Zhan et al., 2011). 


Besides, an obvious phylogeographic structure 


of C. fieldii detected in this study is somewhat 
attributed to their relatively lower flight ability 
compared with other animals that harbor a stronger 
ability of movement (Gratton et al., 2008; Wheat 
and Watt, 2008 ). 


dispersal capacity tends to cause the gene flows to 


In these cases, such a low 


decrease continuously, thus promoting the genetic 
divergences among populations. 
4.3 Demographic history 

Previous studies have shown that compared with 
the drastic climate oscillations and glacial cycles in 
Europe and North America ( Hewitt, 2004; Schmitt, 
2007), a relatively mild Pleistocene climate and 
lack of ice sheets were presented in many areas of 
East Asia (Ju et al., 2007). Population expansion 
of many species in Europe and North America 
occurred after LGM ( Last Glacial 
Maximum ) , while the population expansions of many 


usually 


widespread species occurred before LGM in East 
Asia (Huang et al., 2010). Our study showed that 
the two clades of C. fieldii might have taken different 
pathways in their evolutionary histories; clade II 
experienced an obvious population expansion event at 
about 0.085 Ma in the late Pleistocene interglacial 
period before the LGM, while clade I did not show 
Thus, the 
demographic expansion of clade II during this period 


any signs of population expansion. 


was consistent with previous reports concerning some 
birds and mammals in East Asia, such as the Chinese 
bamboo partridge Bambusicola thoracica (Huang et al., 
2010) , Chinese hwamei Leucodioptron canorum (Li 
et al., 2009), and tufted deer Elaphodus 
cephalophus (Sun et al., 2016). 

The high haplotype diversity and low nucleotide 
diversity also indicated that the clade II of C. fieldii 
might have undergone a rapid population expansion 
and accumulation of variation after the bottleneck 
effect. Accordingly, the discordance in evolutionary 
histories of the two clades of C. fieldii could be 
habitat 
Pleistocene 


accounted for the differentiation of 


distributions. During the extensive 
glaciation period, most regions of Qinghai and Tibet 
might have been heavily covered with ice, but the 
plain areas remained relatively ice-free, under which 
condition milder climate might have mitigated 
demographic stresses for the populations of plain 
areas relative to the extremes experienced by the 
QTP populations ( Zhang et al., 2000; Qu et al., 
2010). Additionally, these different evolutionary 
histories were probably caused by the different 
environmental events they experienced, that is, 
during late Pleistocene interglacial period (MISS) , 


the reinforced winter monsoon and decreased heavy 
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rainfall on the core plateau synergistically promoted 
the population expansion of clade II. Similar cases 
were also reported in birds, plants, and other insects 
(Wang et al., 2010; Lei et al., 2014; Ye et al., 
2016). 
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HE: | AO) SAP Be SSR Colias fieldii 种 群 遗 传 多 样 性 和 遗传 分 化 情况 及 其 系统 发 生 关 
A ,推测 其 起 源 及 分 化 时 间 , 并 探讨 其 历史 生物 地 理 分 布 格局 的 成 因 。【 方 法 】 对 2006 -2018 年 采 
集 的 中 国 23 个 橙黄 豆 粉 蝶 地 理 种 群 的 115 头 个 体 样品 的 4 个 线粒体 基因 (COIJ，Cytb,，NDI 和 
NDS) 序列 进行 PCR 扩 增 和 测序 ;采用 MEGA v.7.0, DnaSP v.5.0 和 Arlequin v. 3.5.1 等 软件 分 析 
其 遗传 多 样 性 和 遗传 分 化 情况 ; 以 其 他 近 缘 豆 粉 蝶 种 类 作为 外 类 群 , 采 用 IQ-TREE，MrBayes 
v.3.1.2, Network v.4.6 fe BEAST v. 1.8.3 等 软件 重建 橙黄 豆 粉 蝶 的 系统 发 生 树 和 单 倍 型 网 络 图 ， 
并 使 用 宽松 分 子 钟 以 及 前 人 的 时 间 标 定 推测 橙黄 豆 粉 蝶 的 起 源 和 分 化 时 间 ; 结 合 现今 橙黄 豆 粉 蝶 
的 生物 地 理 分 布 特点 和 第 四 纪 以 来 的 地 球 环 境 背 景 ,探讨 其 历史 生物 地 理 分 布 格局 及 成 因 。【 结 
RIERA A NRR A (COI, Cytb, NDI 和 NDS) 片段 长 度 分 别 为 648，699 ,393 和 
777 bp, 这 4 个 基因 的 串联 序列 总 长 为 2 517 bp, 具 有 明显 的 AT 偏 倚 特 征 。 基 于 4 个 线粒体 基因 
序列 , 供 试 橙黄 豆 粉 蝶 23 个 地 理 种 群 115 头 个 体 中 共 检 测 出 18 个 单 倍 型 ,总 群体 的 单 倍 型 多 样 性 
(Hd) 404K BRS AEE (ar) 分 别 为 0.677 +0. 048 和 0. 00066 +0. 00007 ,呈现 出 较 高 的 单 倍 型 多 样 
性 和 较 低 的 核 苦 酸 多 样 性 。 系 统 发 育 分 析 表 明 ,橙黄 豆 粉 蝶 种 群 的 18 个 单 倍 型 分 为 2 个 具有 明显 
地 理 分 布 格局 的 支 系 ( 工 和 开 ), 支 系 工 包含 13 个 单 倍 型 ,主要 来 自 陕 西 、 河南、 甘肃 安徽 、 湖 北 、 
四 川 \ 青 海 及 云南 部 分 地 区 的 种 群 ; 支 系 开 由 5 个 单 倍 型 组 成 ,主要 来 自 云南 部 分 地 区 及 西藏 的 种 
群 ; 单 倍 型 网 络 图 与 系统 发 生 树 结果 一 致 。AMOVA 分 析 结 果 表 明 , 大 部 分 的 种 群 遗传 分 化 
(64. 36% ) 来 自 于 橙黄 豆 粉 蝶 种 群 两 单 倍 型 支 系 间 ,各 分 支 内 的 遗传 分 化 较 小 (35. 649% ) 。 中 性 检 
验 和 错 配 分 布 分 析 结 果 显 示 , 单 倍 型 支 系 工 的 种 群 未 发 生 过 种 群 扩张 事件 ,而 单 倍 型 支 系 开 的 种 群 
在 历史 上 发 生 过 种 群 扩张 事件 ,扩张 事件 发 生 的 时 间 (0.085 Ma) 位 于 末次 冰 盛 期 前 的 间 冰 期 ,这 
一 结果 可 能 是 由 于 间 冰 期 温暖 湿润 的 高 原 气 候 以 及 森林 草原 扩张 、 强 降雨 减少 等 因素 造成 的 。 
【结论 】 橙 黄豆 粉 蝶 种 群 的 遗传 分 化 与 地 理 距离 之 间 存 在 明显 的 相关 性 。 其 祖先 可 能 在 距 今 48 万 
年 前 起 源 于 我 国 的 西南 地 区 (现今 横断 山区 一 带 ) ,之 后 ,由 于 第 四 纪 冰 期 - 间 冰 期 轮回 事件 、 东 南亚 
季风 以 及 栖息 地 环境 的 影响 而 分 化 为 2 大 支 系 并 逐渐 向 低 海拔 地 区 扩散 。 
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